Logical Methods in Computer Science 
Vol. 4(3:8) 2008, pp. 1-20 
www.lmcs-online.org 



Submitted Feb. 1, 2007 
Published Sep. 13, 2008 



CONSISTENCY AND COMPLETENESS OF REWRITING IN THE 
CALCULUS OF CONSTRUCTIONS* 

DARIA WALUKIEWICZ-CHRZĄSZCZ AND JACEK CHRZĄSZCZ 

Institute of Informatics, Warsaw University, ul. Banacha 2, 02-097 Warsaw, Poland 
e-mail address: {daria,chrzaszcz}@mimuw.edu.pl 



Abstract. Adding rewriting to a proof assistant based on the Curry-Howard isomor- 
phism, such as Coq, may greatly improve usability of the tool. Unfortunately adding an 
arbitrary set of rewrite rules may render the underlying formal system undecidable and 
inconsistent. While ways to ensure termination and confluence, and hence decidability of 
type-checking, have already been studied to some extent, logical consistency has received little 
attention so far. 

In this paper we show that consistency is a consequence of canonicity, which in turn 
follows from the assumption that all functions defined by rewrite rules are complete. We 
provide a sound and terminating, but necessarily incomplete algorithm to verify this prop- 
erty. The algorithm accepts all definitions that follow dependent pattern matching schemes 
presented by Coquand and studied by McBride in his PhD thesis. It also accepts many 
definitions by rewriting including rules which depart from standard pattern matching. 



1. Introduction 

Equality is ubiquitous in mathematics. Yet it turns out that proof assistants based on 
the Curry-Howard isomorphism, such as Coq [11], are not very good at handling equality. 
While proving an equality is not a problem in itself, using already established equalities is 
quite problematic. Apart from equalities resulting from internal reductions (namely, beta 
and iota reductions), which can be used via the conversion rule of the calculus of inductive 
constructions without being recorded in the proof term, any other use of an equality requires 
giving all details about the context explicitly in the proof. As a result, proof terms may 
become extremely large, taking up memory and making type-checking time consuming: 
working with equations in Coq is not very convenient. 

A straightforward idea for reducing the size of proof terms is to allow other equalities in 
the conversion, making their use transparent. This can be done by using user-defined rewrite 
rules. However, adding arbitrary rules may easily lead to logical inconsistency, making the 
proof environment useless. It is of course possible to put the responsibility on the user, but 
it is contrary to the current Coq policy to guarantee consistency of developments without 
axioms. Therefore it is desirable to retain this guarantee when rewriting is added to Coq. 
Since consistency is undecidable in the presence of rewriting in general, one has to find some 
decidable criteria satisfied only by rewriting systems which do not violate consistency. 

The syntactical proof of consistency of the calculus of constructions, which is the basis 
of the formalism implemented in Coq, requires every term to have a normal form [2]. The 
same proof is also valid for the calculus of inductive constructions [24], which is even closer 
to the formalism implemented in Coq. 

There exist several techniques to prove (strong) normalization of the calculus of con- 
structions with rewriting [U El l21j 122] , following numerous works about rewriting in the 
simply-typed lambda calculus. Practical criteria for ensuring other fundamental properties, 
like confluence, subject reduction and decidability of type-checking are addressed e.g. in [6]. 

Logical consistency is also studied in [6]. It is shown under the assumption that for 
every symbol $f$ defined by rewriting, $f(t_1, \ldots, t_n)$ is reducible if $t_1 \ldots t_n$ are terms in normal 
form in the environment consisting of one type variable. Apart from a proof sketch that 
this is the case for the two rules defining the induction predicate for natural numbers and 
a remark that this property resembles the completeness of definitions, practical ways to 
satisfy the assumption of the consistency lemma are not discussed. 

Techniques for checking completeness of definitions have been known for almost 30 years in 
the first-order algebraic setting [14, 20, 15]. More recently, their adaptations to type theory 
appeared in [12, 16] and [18]. In this paper we show how the latter algorithm can be tailored 
to the calculus of constructions extended with rewriting. We study a system where the set 
of available function symbols and rewrite rules are not known from the beginning but may 
grow as the proof development advances, as it is the case with concrete implementations of 
modern proof assistants. 

We show that logical consistency is an easy consequence of canonicity, which in turn 
can be proved from completeness of definitions by rewriting, provided that termination 
and confluence are proved first. Our completeness checking algorithm closes the list of 
necessary procedures needed to guarantee logical consistency of developments in a proof 
assistant based on the calculus of constructions with rewriting. 

In fact, in this paper we work in a framework which is slightly more general than 
the calculus of constructions, namely that of pure type systems, of which the calculus of 
constructions is an instance. However, since termination and confluence are used both in 
our algorithm and in the proof of its correctness, our results are useful only if termination 
and confluence criteria exist for a given pure type system extended with rewriting. Some 
work in this direction has been done, e.g., in jjj. 

2. Rewriting in the Calculus of Constructions 

Let us briefly discuss how we imagine introducing rewriting in Coq and what problems 
we encounter on the way to a usable system. 

From the user's perspective, definitions by rewriting could be entered just like all other 
definitions:^1 

^1 The syntax of the definition by rewriting is inspired by the experimental "recriture" branch of Coq 
developed by Blanqui. For the sake of clarity we omit certain details, like environments of rule variables, and 
allow the infix + in the definition. 
Inductive nat : Set := 0 : nat | S : nat -> nat. 

Symbol + : nat -> nat -> nat 
Rules 
  0 + y --> y 
  x + 0 --> x 
  (S x) + y --> S (x + y) 
  x + (S y) --> S (x + y) 
  x + (y + z) --> (x + y) + z. 

Parameter n : nat. 

The above fragment can be interpreted as an environment consisting of the inductive definition 
of natural numbers, a symmetric definition of addition by rewriting and the declaration of 
a variable n of type nat. In this environment all rules for + contribute to conversion. For 
instance, both $\forall x : \mathtt{nat}.\ x + 0 = x$ and $\forall x : \mathtt{nat}.\ 0 + x = x$ can be proved by 
$\lambda x : \mathtt{nat}.\ \mathtt{refl}\ \mathtt{nat}\ x$, where refl is the only constructor of the Leibniz equality inductive 
predicate. Note that the definition of + is terminating and confluent. The latter can be checked 
by an (automatic) examination of its critical pairs. 
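
To make the role of the symmetric rules concrete, here is a minimal sketch of a first-order rewriting engine for the five rules of + given above. It is only an illustration written in Python: the term encoding (variables as strings, applications as tuples) and the function names `match`, `instantiate`, `step` and `normalize` are assumptions of this sketch and do not come from Coq or from the "recriture" branch. It shows that both n + 0 and 0 + n normalize to n, which is why both equalities hold by conversion.

```python
# Terms: a variable is a str; an application is a tuple (head, arg1, ..., argk).
ZERO = ("0",)

def S(t):
    return ("S", t)

def plus(a, b):
    return ("+", a, b)

# The five rules for + from the text, as (left-hand side, right-hand side) pairs.
RULES = [
    (plus(ZERO, "y"), "y"),
    (plus("x", ZERO), "x"),
    (plus(S("x"), "y"), S(plus("x", "y"))),
    (plus("x", S("y")), S(plus("x", "y"))),
    (plus("x", plus("y", "z")), plus(plus("x", "y"), "z")),
]

def match(pattern, term, subst):
    """Syntactic matching: return subst extended so that pattern instantiated by it equals term, or None."""
    if isinstance(pattern, str):  # pattern variable
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def instantiate(term, subst):
    """Replace the rule variables of term by their images under subst."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(instantiate(a, subst) for a in term[1:])

def step(term):
    """Perform one rewrite step (arguments first), or return None if term is in normal form."""
    if isinstance(term, str):
        return None
    for i, arg in enumerate(term[1:], start=1):
        reduced = step(arg)
        if reduced is not None:
            return term[:i] + (reduced,) + term[i + 1:]
    for lhs, rhs in RULES:
        subst = match(lhs, term, {})
        if subst is not None:
            return instantiate(rhs, subst)
    return None

def normalize(term):
    """Rewrite until no rule applies; this relies on the termination of the rules for +."""
    while True:
        reduced = step(term)
        if reduced is None:
            return term
        term = reduced
```

For example, `normalize(plus("n", ZERO))` and `normalize(plus(ZERO, "n"))` both return `"n"`, where `"n"` plays the role of the declared variable n. With only the usual two structural rules, the first of these terms would be stuck.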

Rewrite rules can also be used to define higher-order and polymorphic functions, like 
the map function on polymorphic lists. In this example, the first two rules correspond to 
the usual definition of map by pattern matching and structural recursion and the third rule 
can be used to quickly get rid of the map function in case one knows that f is the identity 
function. 

Symbol map : forall (A:Set), (A -> A) -> list A -> list A 
Rules 
  map A f (nil A) --> nil A 
  map A f (cons A a l) --> cons A (f a) (map A f l) 
  map A (fun x => x) l --> l 

Even though we consider higher-order rewriting, we choose the simple matching modulo 
α-conversion. Higher-order matching is useful for example to encode logical languages by 
higher-order abstract syntax, but it is seldom used in Coq where modeling relies rather on 
inductive types. Instead of higher-order matching, one needs a possibility not to specify 
certain arguments in left-hand sides, and hence to work with rewrite rules built from terms 
that may not be typable. Consider, for example, the type tree of trees with size, holding 
some Boolean values in the nodes, and the function rotr performing a right rotation in the 
root of the tree. 

Inductive tree : nat -> Set := 
    Leaf : tree 0 
  | Node : forall n1:nat, tree n1 -> bool -> forall n2:nat, tree n2 
           -> tree (S(n1+n2)). 

Symbol rotr : forall n:nat, (tree n) -> (tree n) 
Rules 
  rotr 0 t --> t 
  rotr ? (Node 0 t1 a n2 t2) --> Node 0 t1 a n2 t2 
  rotr ?1 (Node ?2 (Node ?3 A b ?4 C) d ?5 E) 
    --> Node ?3 A b (S (?4 + ?5)) (Node ?4 C d ?5 E) 

The first argument of rotr is the size of the tree and the second is the tree itself. The first 
two rules cover the trees which cannot be rotated and the third one performs the rotation. 

The ? marks above should be treated as different variables. The information they hide 
is redundant for typable terms: if we take the third rule for example, the values of ?3, ?4 
and ?5 must correspond to the sizes of the trees A, C and E respectively, ?2 must be equal 
to S(?3+?4) and ?1 to S(?2+?5). Note that by not writing these subterms we make the rule 
left-linear (and therefore easier to match) and avoid critical pairs with +, thereby helping 
the confluence proof. 

This way of writing left-hand sides of rules was already used by Werner in [24] to define 
elimination rules for inductive types, making them orthogonal (the left-hand sides are of 
the form $I_{elim}\ P\ f\ w\ (c\ x)$, where $P$, $f$, $w$, $x$ are distinct variables and $c$ is a constructor 
of $I$). In [6], Blanqui gives a precise account of these omissions, using them to make more 
rewriting rules left-linear. Later, the authors of [8] show that these redundant subterms can 
be completely removed from terms (in a calculus without rewriting, however). In [3], a new 
optimized convertibility test algorithm is presented for Coq, which skips the equality test 
for these redundant arguments. 

In our paper we do not explicitly specify which arguments should/could be replaced 
by ? and do not restrict left-hand sides to be left-linear. Instead, we rely on an acceptance 
condition to suitably restrict the form of acceptable definitions by rewriting to guarantee 
the needed metatheoretical properties listed in the next section. 

It is also interesting to note that when the first argument of rotr is ?1 then we may 
understand it as S(?2+?5) matched to terms modulo the convertibility relation and not just 
syntactically (i.e., modulo α-conversion). 

3. Pure Type Systems with Generative Definitions 

Even though most papers motivated by the development of Coq concentrate on the 
calculus of constructions, we present here a slightly more general formalization of a pure 
type system with inductive definitions and definitions by rewriting. The presentation, taken 
from [9, 10], is quite close to the way these elements could be implemented in Coq. The 
formalism is built upon a set of PTS sorts $\mathcal{S}$, a binary relation $\mathcal{A}$ and a ternary relation $\mathcal{R}$ 
over $\mathcal{S}$ governing the typing rules (Term/Ax) and (Term/Prod) respectively (Figure 1). 
The syntactic class of pseudoterms is defined as follows: 

$$t ::= v \mid s \mid t_1\, t_2 \mid \lambda v{:}t_1.t_2 \mid (v{:}t_1)t_2$$ 

A pseudoterm can be a variable $v \in Var$, a sort $s \in \mathcal{S}$, an application, an abstraction 
or a product. We write $|t|$ to denote the size of the pseudoterm $t$, with $|v| = |s| = 1$. 
We use the Greek letters $\gamma$, $\delta$ to denote substitutions, which are finite partial maps from 
variables to pseudoterms. The postfix notation is used for the application of substitutions to 
pseudoterms. 

Inductive definitions and definitions by rewriting are generative, i.e. they are stored in 
the environment and are used in terms only through the names they "generate". An environment 
is a sequence of declarations, each of which is a variable declaration $v : t$, an inductive 
definition $\mathsf{Ind}(\Gamma^I := \Gamma^C)$, where $\Gamma^I$ and $\Gamma^C$ are environments providing names and types 
of (possibly mutually defined) inductive types and their constructors, or a definition by 
rewriting $\mathsf{Rew}(\Gamma, R)$, where $\Gamma$ is an environment providing names and types of (possibly 
mutually defined) function symbols and $R$ is a set of rewrite rules defining them. Types of 
inductive types, constructors and function symbols determine their arity: given $v : t$ in an 
inductive definition or a definition by rewriting, if $t$ is of the form $(x_1 : t_1) \ldots (x_n : t_n)\,t'$ where 
$t'$ is not a product, then $n$ is the arity of $v$. 

A rewrite rule is a triple denoted by $\Delta \vdash l \longrightarrow r$, where $l$ and $r$ are pseudoterms and 
$\Delta$ is an environment providing names and types of the variables occurring in the left- and 
right-hand sides $l$ and $r$. 

Given an environment $E$, inductive types, constructors and function symbols declared 
in $E$ are called constants (even though syntactically they are variables). We often write 
$h(e_1, \ldots, e_n)$ to denote the application of a constant $h$ to pseudoterms $e_1, \ldots, e_n$, when $n$ is 
the arity of $h$. General environments are denoted by $E$ and environments containing only 
variable declarations are denoted by $\Gamma$, $\Delta$, $G$, $D$. We assume that names of all declarations 
in environments are pairwise distinct. A pair consisting of an environment $E$ and a term $e$ 
is called a sequent and denoted by $E \vdash e$. A sequent is well-typed if $E \vdash e : t$ for some $t$. 

Definition 3.1. A pure type system with generative definitions is defined by the typing 
rules in Figure 1, where: 

• POS is a positivity condition for inductive definitions (see assumptions below). 

• ACC is an acceptance condition for definitions by rewriting (idem). 

• The relation $\approx$ used in the rule (Term/Conv) is the smallest congruence on well-typed 
terms generated by $\longrightarrow$, which is the sum of beta and rewrite reductions, denoted by 
$\longrightarrow_\beta$ and $\longrightarrow_R$ respectively (for the exact definition see [10], Section 2.8). 

• The notation $\delta : \Gamma \to E$ means that $\delta$ is a well-typed substitution, i.e. $E \vdash v\delta : t\delta$ for all 
$v : t \in \Gamma$. 

As in [22] [H], recursors and their reduction rules have no special status and they are supposed 
to be expressed by rewriting. 

Assumptions. We assume that we are given a positivity condition POS for inductive def- 
initions and an acceptance condition ACC for definitions by rewriting. Together with the 
right choice of the PTS they must imply the following properties: 
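P1 subject reduction, i.e. $E \vdash e : t$, $E \vdash e \longrightarrow e'$ implies $E \vdash e' : t$. 
P2 uniqueness of types, i.e. $E \vdash e : t$, $E \vdash e : t'$ implies $E \vdash t \approx t'$. 

P3 strong normalization, i.e. $E \vdash \mathsf{ok}$ implies that reductions of all well-typed terms in $E$ 
are finite. 

P4 confluence, i.e. $E \vdash e : t$, $E \vdash e \longrightarrow^* e'$, $E \vdash e \longrightarrow^* e''$ implies $E \vdash e' \longrightarrow^* \hat{e}$ and 
$E \vdash e'' \longrightarrow^* \hat{e}$ for some $\hat{e}$. 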

These properties are usually true in all well-behaved type theories. They are for example 
all proved for the calculus of algebraic constructions [6], an extension of the calculus of 
constructions with inductive types and rewriting, where POS is the strict positivity condition 
as defined in [17], and ACC is the General Schema. 

From now on, we use the notation $t{\downarrow}$ for the unique normal form of $t$. 

4. Consistency and Completeness 

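Consistency of the calculus of constructions (resp. calculus of inductive constructions) 
can be shown by rejecting all cases of a hypothetical normalized proof $e$ of $(x : \star)x$ in a 
closed environment, i.e. empty environment (resp. an environment containing only inductive 
definitions and no axioms). Our goal is to extend the definition of closed environments to the 
calculus of constructions with rewriting, allowing it to include a certain class of definitions 
by rewriting. 

Definition correctness. Let $\Gamma^I = I_1 : t^I_1 \ldots I_n : t^I_n$ and $\Gamma^C = c_1 : t^C_1 \ldots c_m : t^C_m$. 

$$\frac{\begin{array}{l} E \vdash t^I_j : s_j \quad t^I_j = (\vec{z} : \vec{Z}_j)\,s'_j \quad \text{for } j = 1 \ldots n \\ E; \Gamma^I \vdash t^C_i : \hat{s}_i \quad t^C_i = (\vec{z} : \vec{Z}'_i)\,I_{j_i}\vec{w} \quad \text{for } i = 1 \ldots m \end{array}}{E \vdash \mathsf{Ind}(\Gamma^I := \Gamma^C) : \mathsf{correct}} \quad \text{if } \mathsf{POS}_E(\Gamma^I := \Gamma^C)$$ 

Let $\Gamma = f_1 : t_1 \ldots f_n : t_n$ and $R = \{\Gamma_i \vdash l_i \longrightarrow r_i\}_{i=1 \ldots m}$, where $\Gamma_i = x^i_1 : t^i_1; \ldots; x^i_{n_i} : t^i_{n_i}$. 

$$\frac{\begin{array}{l} E \vdash t_k : s_k \quad \text{for } k = 1 \ldots n \\ E; \Gamma_i \vdash \mathsf{ok} \quad FV(l_i, r_i) \subseteq \Gamma_i \quad \text{for } i = 1 \ldots m \end{array}}{E \vdash \mathsf{Rew}(\Gamma, R) : \mathsf{correct}} \quad \text{if } \mathsf{ACC}_E(\Gamma, R)$$ 

Environment correctness. 

$$\frac{}{\epsilon \vdash \mathsf{ok}} \qquad \frac{E \vdash \mathsf{ok} \quad E \vdash t : s}{E; v : t \vdash \mathsf{ok}} \qquad \frac{E \vdash \mathsf{ok} \quad E \vdash \mathsf{Ind}(\Gamma^I := \Gamma^C) : \mathsf{correct}}{E; \mathsf{Ind}(\Gamma^I := \Gamma^C) \vdash \mathsf{ok}} \qquad \frac{E \vdash \mathsf{ok} \quad E \vdash \mathsf{Rew}(\Gamma, R) : \mathsf{correct}}{E; \mathsf{Rew}(\Gamma, R) \vdash \mathsf{ok}}$$ 

Lookup. 

$$\frac{E_1; v : t; E_2 \vdash \mathsf{ok}}{E_1; v : t; E_2 \vdash v : t}$$ 

$$\frac{E \vdash \mathsf{ok}}{E \vdash I_i : t^I_i} \qquad \frac{E \vdash \mathsf{ok}}{E \vdash c_i : t^C_i} \qquad \text{where } E = E_1; \mathsf{Ind}(\Gamma^I := \Gamma^C); E_2,\ \Gamma^I = I_1 : t^I_1 \ldots I_n : t^I_n,\ \Gamma^C = c_1 : t^C_1 \ldots c_m : t^C_m$$ 

$$\frac{E \vdash \mathsf{ok}}{E \vdash f_i : t_i} \qquad \frac{E \vdash \mathsf{ok} \quad \delta : \Gamma_i \to E}{E \vdash l_i\delta \longrightarrow_R r_i\delta} \qquad \text{where } E = E_1; \mathsf{Rew}(\Gamma, R); E_2,\ \Gamma = f_1 : t_1 \ldots f_n : t_n,\ R = \{\Gamma_i \vdash l_i \longrightarrow r_i\}_{i=1 \ldots m}$$ 

PTS rules. 

$$\text{(Term/Prod)}\ \frac{E \vdash t_1 : s_1 \quad E; v : t_1 \vdash t_2 : s_2}{E \vdash (v : t_1)t_2 : s_3}\ \text{where } (s_1, s_2, s_3) \in \mathcal{R} \qquad \text{(Term/Abs)}\ \frac{E; v : t_1 \vdash e : t_2 \quad E \vdash (v : t_1)t_2 : s}{E \vdash \lambda v{:}t_1.e : (v : t_1)t_2} \qquad \text{(Term/Ax)}\ \frac{E \vdash \mathsf{ok}}{E \vdash s_1 : s_2}\ \text{where } (s_1, s_2) \in \mathcal{A}$$ 

$$\text{(Term/App)}\ \frac{E \vdash e : (v : t_1)t_2 \quad E \vdash e' : t_1}{E \vdash e\ e' : t_2\{v \mapsto e'\}} \qquad \text{(Term/Conv)}\ \frac{E \vdash e : t \quad E \vdash t' : s \quad E \vdash t \approx t'}{E \vdash e : t'}$$ 

Figure 1: Definition correctness, environment correctness and lookup, PTS rules 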

Let us try to identify that class. If we reanalyze $e$ in the new setting, the only new 
possible normal form of $e$ is an application $f(\vec{e})$ of a function symbol $f$, coming from a 
rewrite definition $\mathsf{Rew}(\Gamma, R)$, to some arguments in normal form. There is no obvious 
argument why such terms cannot be proofs of $(x : \star)x$. On the other hand, if we knew 
that such terms were always reducible, we could complete the consistency proof. Let us 
call $\mathsf{COMP}(\Gamma, R)$ the condition on rewrite definitions we are looking for (i.e. $f(\vec{e})$ is always 
reducible), which can also be read as: the function symbols from $\Gamma$ are completely defined 
by the set of rules $R$. 
Note that the completeness of $f$ has to be checked much earlier than it is used: we 
use it in a given closed environment $E = E_1; \mathsf{Rew}(\Gamma, R); E_2$ but it has to be checked when 
$f$ is added to the environment, i.e. in the environment $E_1$. It implies that completeness 
checking has to account for environment extension and can be performed only with respect 
to arguments of types whose set of normal forms cannot change in the future. 
This is the case for arguments of inductive types. 

The requirement that functions defined by rewriting are completely defined could very 
well be included in the condition ACC. On the other hand, the separation between ACC 
and COMP is motivated by the idea of working with abstract function symbols, equipped 
with some rewrite rules not defining them completely. For example, if + from Section 2 
were declared using only the last rewrite rule (associativity), one could develop a theory of 
an associative function over natural numbers. 

The intuition behind the definitions given below is the following. A rewrite definition 
$\mathsf{Rew}(\Gamma, R)$ is complete (satisfies $\mathsf{COMP}(\Gamma, R)$) if for all $f$ in $\Gamma$, the goal $f(x_1, \ldots, x_n)$ is 
covered by $R$. A goal is covered if all its instances are immediately covered, i.e. head-reducible 
by $R$. Following the discussion above we limit ourselves to normalized canonical 
instances, i.e. built from constructors wherever possible. 

Definition 4.1 (Canonical form and canonical substitution). Given a judgment $E \vdash e : t$ 
we say that the term $e$ is in canonical form if and only if: 

• if $t{\downarrow}$ is an inductive type then $e = c(e_1, \ldots, e_n)$ for some constructor $c$ and terms $e_1, \ldots, e_n$ 
in canonical form 

• otherwise $e$ is arbitrary 

Let $\Delta$ be a variable environment and $E$ a correct environment. We call $\delta : \Delta \to E$ canonical 
if for every variable $x \in \Delta$, the term $x\delta$ is canonical. 

From now on, let $E$ be a global environment and let $\mathsf{Rew}(\Gamma, R)$ be a rewrite definition 
such that $E \vdash \mathsf{Rew}(\Gamma, R) : \mathsf{correct}$. Let $f : (x_1 : t_1) \ldots (x_n : t_n)\,t \in \Gamma$ be a function symbol 
of arity $n$. 

Definition 4.2. A goal is a well-typed sequent $E; \Gamma; \Delta \vdash f(e_1, \ldots, e_n)$. 

A normalized canonical instance of the goal $E; \Gamma; \Delta \vdash f(e_1, \ldots, e_n)$ is a well-typed 
sequent $E; \mathsf{Rew}(\Gamma, R); E' \vdash f(e_1\delta{\downarrow}, \ldots, e_n\delta{\downarrow})$ for any canonical substitution 
$\delta : \Delta \to E; \mathsf{Rew}(\Gamma, R); E'$. 

A term $e$ is immediately covered by $R$ if there is a rule $G \vdash l \longrightarrow r$ in $R$ and a 
substitution $\gamma$ such that $l\gamma = e$. By obvious extension we can also write that a goal or a 
normalized canonical instance is immediately covered by $R$. 

A goal is covered by $R$ if all its normalized canonical instances are immediately covered 
by $R$. 

Note that, formally, a normalized canonical instance is not a goal. The difference is 
that the conversion corresponding to the environment of an instance contains reductions 
defined by R, while the one of a goal does not. 

Definition 4.3 (Complete definition). A rewrite definition $\mathsf{Rew}(\Gamma, R)$ is complete in the 
environment $E$, which is denoted by $\mathsf{COMP}_E(\Gamma, R)$, if and only if for all function symbols 
$f : (x_1 : t_1) \ldots (x_n : t_n)\,t \in \Gamma$ the goal $E; \Gamma; x_1 : t_1; \ldots; x_n : t_n \vdash f(x_1, \ldots, x_n)$ is covered 
by $R$. 
Example 4.4. The terms (S 0), $\lambda x : \mathtt{nat}.\ x$ and (Node 0 Leaf true 0 Leaf) are canonical, 
while (0 + 0) and (Node nA A b 0 Leaf) are not. Given the definition of rotr from 
Section 2, consider the following terms: 

$t_1$ = rotr (S (nA + nC)) (Node nA A b nC C) 
$t_2$ = rotr (S 0) (Node 0 Leaf true 0 Leaf) 

Both (with their respective environments) are goals for rotr, and $t_2$ (with a slightly different 
environment) is also a normalized canonical instance of $t_1$. The goal $t_1$ is not immediately 
covered, but its instance $t_2$ is, as it is head-reducible by the second rule defining rotr. Since 
other instances of $t_1$ are also immediately covered, the goal is covered (see Example 5.20). 
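
The canonicity check of Definition 4.1 can be sketched for the first-order terms of Example 4.4 as follows. This is an illustrative simplification in Python, not the paper's definition in full generality: the constructor table `CONSTRUCTORS`, the set `INDUCTIVE_TYPES` and the function `is_canonical` are names assumed here, types are plain strings, and the size index of tree is ignored.

```python
# Constructor signatures: name -> (argument types, result type).
# They mirror nat, bool and tree from the text; the size index of tree is ignored.
CONSTRUCTORS = {
    "0": ([], "nat"),
    "S": (["nat"], "nat"),
    "true": ([], "bool"),
    "false": ([], "bool"),
    "Leaf": ([], "tree"),
    "Node": (["nat", "tree", "bool", "nat", "tree"], "tree"),
}
INDUCTIVE_TYPES = {"nat", "bool", "tree"}

def is_canonical(term, type_name):
    """Terms are str (variables) or tuples (head, args...); type_name is the expected type."""
    if type_name not in INDUCTIVE_TYPES:
        return True  # at a non-inductive type every term is canonical
    if isinstance(term, str):
        return False  # a variable of inductive type is not a constructor term
    head, args = term[0], term[1:]
    if head not in CONSTRUCTORS:
        return False  # headed by a function symbol such as +
    arg_types, result_type = CONSTRUCTORS[head]
    if result_type != type_name or len(args) != len(arg_types):
        return False
    return all(is_canonical(a, ty) for a, ty in zip(args, arg_types))
```

With this encoding, `("S", ("0",))` is canonical at type `"nat"`, while `("+", ("0",), ("0",))` is not, and `("Node", "nA", "A", "b", ("0",), ("Leaf",))` is rejected at type `"tree"` because its first three arguments are variables of inductive types.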



It follows that completeness of definitions by rewriting guarantees canonicity and logical 
consistency. 

Definition 4.5. An environment $E$ is closed if and only if it contains only inductive definitions 
and complete definitions by rewriting, i.e. for each partition of $E$ into $E_1; \mathsf{Rew}(\Gamma, R); E_2$ 
the condition $\mathsf{COMP}_{E_1}(\Gamma, R)$ is satisfied. 

Lemma 4.6 (Canonicity). Let $E$ be a closed environment. If $E \vdash e : t$ and $e$ is in normal 
form then $e$ is canonical. 

Proof. By induction on the size of $e$. If $t{\downarrow}$ is not an inductive type then any $e$ is canonical. 
Otherwise, let us analyze the structure of $e$. It cannot be a product, an abstraction or a 
sort because $t{\downarrow}$ is an inductive type. Since $E$ is closed, it is not a variable either. Hence 
$e$ is of the form $e'\,e_1 \ldots e_m$ (with $m$ possibly equal to 0), where $e'$ is not an application. The 
term $e'$ can be neither a product, nor a sort (they cannot be applied), nor a variable ($E$ is 
closed). It is not an abstraction, since $e$ is in normal form. The only possibility left is that 
$e'$ is a constant $h$ of arity $n \le m$, and we get $e = h(e_1, \ldots, e_n)\,e_{n+1} \ldots e_m$. 

Since $t{\downarrow}$ is an inductive type, $h$ cannot be an inductive type. If it is a constructor 
then $n = m$ and by induction hypothesis $e_1, \ldots, e_n$ are in canonical form and so 
is $h(e_1, \ldots, e_n)$. If $h$ is a function symbol then $E = E_1; \mathsf{Rew}(\Gamma, R); E_2$ for some $E_1, E_2$ 
and $h : (x_1 : t_1) \ldots (x_n : t_n)\,t \in \Gamma$ of arity $n \le m$. Since $E$ is closed, $\mathsf{Rew}(\Gamma, R)$ is complete. 
Let us show that $E \vdash h(e_1, \ldots, e_n)$ is a normalized canonical instance of 
$E_1; \Gamma; \Delta \vdash h(x_1, \ldots, x_n)$, where $\Delta = x_1 : t_1; \ldots; x_n : t_n$. By induction hypothesis, the terms 
$e_1, \ldots, e_n$ are canonical and consequently $\delta : \Delta \to E$ defined by $\delta(x_i) = e_i$ is canonical. Moreover, 
$h(e_1, \ldots, e_n) = h(x_1\delta{\downarrow}, \ldots, x_n\delta{\downarrow})$ since $e_1, \ldots, e_n$ are in normal form. But every normalized 
canonical instance of a complete definition is reducible, which contradicts the assumption 
that $e = h(e_1, \ldots, e_n)\,e_{n+1} \ldots e_m$ is in normal form. □ 

Theorem 4.7. Every closed environment is consistent. 

Proof. Let $E$ be a closed environment. Suppose that $E \vdash e : (x : \star)x$. Since $E \vdash \mathsf{ok}$ and 
$E \vdash \mathsf{Ind}(False : \star :=) : \mathsf{correct}$ we have $E' \vdash \mathsf{ok}$ where $E' = E; \mathsf{Ind}(False : \star :=)$. Moreover 
$E'$ is a closed environment. 

Hence, we have $E' \vdash e\ False : False$. By Lemma 4.6 the normal form of $e\ False$ is 
canonical. Since False has no constructors, this is impossible. □ 
5. Checking Completeness 

The objective of this section is to provide an algorithm for checking completeness of 
definitions by rewriting. The algorithm presented in Subsection 5.2 checks that a goal is 
covered using successive splitting (Definition 5.3), i.e., replacement of variables of inductive 
types by constructor patterns. In order to know which constructor terms can replace a 
given variable, one has to compare types and hence an algorithm for unification modulo 
conversion is needed (Definition 5.2). Consider for example the first rule of the definition of 
rotr. It is clear that only Leaf can replace t in rotr 0 t because other trees have types 
that do not unify with tree 0. 

Correctness of the completeness checking algorithm is proved in Lemma 5.19. It is done 
using an additional assumption on rewrite systems called preservation of reducibility, which 
is discussed in Subsection 5.1. 



Definition 5.1 (Unification problem). A quadruple $E, \Delta \vdash t = s$, where $E$ is an environment, 
$\Delta$ a variable environment and $s$, $t$ are terms, is a unification equation in $E$. A unification 
problem in $E$ is a finite set of unification equations. Without loss of generality we 
may assume that the variable environments $\Delta$ in all equations are the same. 

A unifier or a solution of the unification problem $U$ is a substitution $\gamma : \Delta \to E; E'$ 
such that $E; E' \vdash t\gamma \approx s\gamma$ for every $E, \Delta \vdash t = s$ in $U$. We say that $E'$ is the co-domain 
of $\gamma$, which is denoted by $Ran(\gamma)$. 

A unifier $\gamma$ is the most general unifier if $Ran(\gamma)$ is a variable environment $\Delta'$ and 
for every unifier $\delta : \Delta \to E; E''$ there exists a substitution $\delta' : \Delta' \to E; E''$ such that 
$E; E'' \vdash \delta \approx \gamma; \delta'$. 

Definition 5.2 (Correct unification algorithm). A unification algorithm is a procedure 
which for every unification problem $U = \{E, \Delta \vdash t_i = s_i\}$ returns a substitution $\gamma$, a 
bottom $\bot$, or a question mark ?. The algorithm is correct if and only if: if it answers $\gamma$, it 
is the most general unifier $\gamma : \Delta \to E; \Delta'$ such that $\Delta' \subseteq \Delta$ and for all $x \in \Delta'$, $\gamma(x) = x$; if 
it answers $\bot$, $U$ has no unifier. 

Since unification modulo conversion is undecidable, every correct unification algorithm 
must return ? in some cases, which may be seen as too difficult for the algorithm. An 
example of such a partial unification algorithm is constructor unification, that is first-order 
unification with constructors and type constructors as rigid symbols, answering ? whenever 
one compares a non-trivial pair of terms involving non-rigid symbols. 
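
The constructor unification just described can be sketched in Python as below. This is a hedged illustration rather than the algorithm Alg assumed in the paper: the term encoding (variables as strings, applications as tuples), the set `RIGID`, the markers `BOTTOM` and `UNKNOWN`, and the functions `walk`, `occurrence` and `unify` are all assumptions of the sketch. Note that the occurs check answers bottom only when the variable sits below rigid symbols; below a function symbol such as + the answer has to be ?, because x and x + 0 are convertible.

```python
BOTTOM, UNKNOWN = "bottom", "?"

# Rigid symbols: constructors and type constructors from the examples in the text.
RIGID = {"0", "S", "true", "false", "Leaf", "Node", "nat", "bool", "tree"}

def walk(t, subst):
    """Follow variable bindings until an unbound variable or an application is reached."""
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t

def occurrence(var, t, subst, rigid_path=True):
    """None if var does not occur in t, BOTTOM if it occurs below rigid symbols only, else UNKNOWN."""
    t = walk(t, subst)
    if isinstance(t, str):
        if t != var:
            return None
        return BOTTOM if rigid_path else UNKNOWN
    rigid_path = rigid_path and t[0] in RIGID
    found = None
    for arg in t[1:]:
        result = occurrence(var, arg, subst, rigid_path)
        if result == BOTTOM:
            return BOTTOM
        found = found or result
    return found

def unify(a, b, subst):
    """Return an extended substitution (a dict, in triangular form), BOTTOM or UNKNOWN."""
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst  # trivial pair
    if isinstance(a, str) or isinstance(b, str):
        var, term = (a, b) if isinstance(a, str) else (b, a)
        clash = occurrence(var, term, subst)
        return {**subst, var: term} if clash is None else clash
    if a[0] not in RIGID or b[0] not in RIGID:
        return UNKNOWN  # non-trivial pair involving a non-rigid symbol
    if a[0] != b[0] or len(a) != len(b):
        return BOTTOM  # distinct rigid heads never become convertible
    for x, y in zip(a[1:], b[1:]):
        subst = unify(x, y, subst)
        if not isinstance(subst, dict):
            return subst
    return subst
```

For instance, `unify(("tree", ("0",)), ("tree", ("S", "m")), {})` answers `BOTTOM`, whereas comparing `("tree", ("+", "n", ("0",)))` with `("tree", ("0",))` answers `UNKNOWN` because + is not rigid.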

From now on we assume the existence of a correct (partial) unification algorithm Alg. 

Definition 5.3 (Splitting). Let $E; \Gamma; \Delta \vdash f(\vec{e})$ be a goal. A variable $x$ is a splitting variable 
if $x : t \in \Delta$ and $t{\downarrow} = I\vec{u}$ for some inductive type $I \in E$. 

A splitting operation considers all constructors $c$ of the inductive type $I$ and for each 
of them constructs the following unification problem $U_c$: 

$$E; \Gamma, \Delta; \Delta_c \vdash x = c(z_1, \ldots, z_k) \qquad E; \Gamma, \Delta; \Delta_c \vdash I\vec{u} = I\vec{w}$$ 

where $c : (z_1 : Z_1) \ldots (z_k : Z_k).I\vec{w}$ and $\Delta_c = z_1 : Z_1, \ldots, z_k : Z_k$. 

If for all constructors $c$, $Alg(U_c) \neq\ ?$, the splitting is successful. In that case, let 
$Sp(x) = \{\sigma_c \mid \sigma_c = Alg(U_c) \wedge Alg(U_c) \neq \bot\}$. The result of splitting is the set of goals 
$\{E; \Gamma; Ran(\sigma_c) \vdash f(\vec{e})\sigma_c\}_{\sigma_c \in Sp(x)}$. 

If $Alg(U_c) =\ ?$ for some $c$, the splitting fails. 
Example 5.4. If one splits the goal rotr n t along n, one gets two goals: rotr 0 t and 
rotr (S m) t. The first one is immediately covered by the first rule for rotr and if we 
split the second one along t, the Leaf case is impossible, because tree 0 does not unify 
with tree (S m), and the Node case gives rotr (S (nA + nC)) (Node nA A b nC C). 

The following lemma states the correctness of splitting, i.e. that splitting does not 
decrease the set of normalized canonical instances. Note that the lemma would also hold 
if we had a unification algorithm returning an arbitrary set of most general solutions, but 
in order for the coverage checking algorithm to terminate the set of goals resulting from 
splitting must be finite. 
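
The interplay of splitting and immediate coverage can be sketched for the simplest case, a non-dependent inductive type such as nat, where every constructor applies and no unification is needed. The Python code below is an illustration under those assumptions and is not the algorithm of Subsection 5.2: the names `CONSTRUCTORS`, `split`, `covered` and the `fuel` bound are hypothetical, and a `False` answer only means that coverage was not established within the bound.

```python
import itertools

ZERO = ("0",)

def S(t):
    return ("S", t)

def plus(a, b):
    return ("+", a, b)

# Constructors of each inductive type: (name, argument types).
CONSTRUCTORS = {"nat": [("0", []), ("S", ["nat"])]}
_fresh = itertools.count()

def match(pattern, term, subst):
    """Syntactic matching of a left-hand side against a goal; goal variables are opaque."""
    if isinstance(pattern, str):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def immediately_covered(goal, left_hand_sides):
    return any(match(lhs, goal, {}) is not None for lhs in left_hand_sides)

def substitute(term, var, value):
    if isinstance(term, str):
        return value if term == var else term
    return (term[0],) + tuple(substitute(a, var, value) for a in term[1:])

def split(goal, env, var):
    """Replace var by each constructor pattern of its type; return the new (goal, env) pairs."""
    rest = {v: ty for v, ty in env.items() if v != var}
    results = []
    for name, arg_types in CONSTRUCTORS[env[var]]:
        fresh_vars = [f"{var}_{next(_fresh)}" for _ in arg_types]
        pattern = (name,) + tuple(fresh_vars)
        new_env = {**rest, **dict(zip(fresh_vars, arg_types))}
        results.append((substitute(goal, var, pattern), new_env))
    return results

def covered(goal, env, left_hand_sides, fuel=3):
    """Try to show that goal is covered, splitting at most fuel times along each branch."""
    if immediately_covered(goal, left_hand_sides):
        return True
    if fuel == 0:
        return False
    for var in env:
        subgoals = split(goal, env, var)
        if all(covered(g, e, left_hand_sides, fuel - 1) for g, e in subgoals):
            return True
    return False
```

With the left-hand sides `plus(ZERO, "y")` and `plus(S("x"), "y")`, the goal `plus("a", "b")` in the environment `{"a": "nat", "b": "nat"}` is shown covered after one split along `a`. With the associativity left-hand side alone it is never covered, which matches the earlier remark that such a symbol stays abstract.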

Lemma 5.5. Let $E; \Gamma; \Delta \vdash f(\vec{e})$ be a coverage goal and let $\{E; \Gamma; Ran(\sigma_c) \vdash f(\vec{e})\sigma_c\}_{\sigma_c \in Sp(x)}$ 
be the result of successful splitting along $x : I\vec{u} \in \Delta$. Then every normalized canonical 
instance of $E; \Gamma; \Delta \vdash f(\vec{e})$ is a normalized canonical instance of $E; \Gamma; Ran(\sigma_c) \vdash f(\vec{e})\sigma_c$ for 
some $\sigma_c \in Sp(x)$. 

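Proof. Let $E; \mathsf{Rew}(\Gamma, R); E' \vdash f(e_1\delta{\downarrow}, \ldots, e_n\delta{\downarrow})$ be a normalized canonical instance 
according to a substitution $\delta : \Delta \to E; \mathsf{Rew}(\Gamma, R); E'$. Since $\delta$ is canonical, $x\delta$ is a 
constructor term $c(s_1, \ldots, s_k)$ for some constructor $c : (z_1 : Z_1) \ldots (z_k : Z_k).I\vec{w}$ of $I$. Let 
us show that $E; \mathsf{Rew}(\Gamma, R); E' \vdash f(e_1\delta{\downarrow}, \ldots, e_n\delta{\downarrow})$ is a normalized canonical instance of 
$E; \Gamma; Ran(\sigma_c) \vdash f(e_1\sigma_c, \ldots, e_n\sigma_c)$. Let $\Delta_c = z_1 : Z_1, \ldots, z_k : Z_k$. 

First note that $\delta \cup [\vec{s}/\vec{z}] : \Delta; \Delta_c \to E; \mathsf{Rew}(\Gamma, R); E'$ is a solution of the unification 
problem $E; \Gamma, \Delta; \Delta_c \vdash x = c(z_1, \ldots, z_k)$ and $E; \Gamma, \Delta; \Delta_c \vdash I\vec{u} = I\vec{w}$, from the definition 
of splitting. Indeed, $x\delta = c(s_1, \ldots, s_k) = c(z_1, \ldots, z_k)[\vec{s}/\vec{z}]$ and 
$E; \mathsf{Rew}(\Gamma, R); E' \vdash (I\vec{u})\delta \approx (I\vec{w})[\vec{s}/\vec{z}]$ since they are both types of $c(s_1, \ldots, s_k)$ in 
$E; \mathsf{Rew}(\Gamma, R); E'$. 

By definition of $\sigma_c$, which is the most general unifier computed by a correct unification 
algorithm, $E; \Gamma; Ran(\sigma_c) \vdash \delta \cup [\vec{s}/\vec{z}] \approx \sigma_c; \delta'$ for some $\delta' : Ran(\sigma_c) \to E; \mathsf{Rew}(\Gamma, R); E'$, 
where $Ran(\sigma_c) \subseteq \Delta; \Delta_c$. Consequently, $E; \Gamma; Ran(\sigma_c) \vdash \delta'(z_m) \approx s_m$ for $z_m \in \Delta_c$ and 
$E; \Gamma; Ran(\sigma_c) \vdash \delta'(y) \approx \delta(y)$ for $y \in \Delta$. Since $\vec{s}$ are canonical terms, $\delta'{\downarrow}$ is a canonical 
substitution. 

Let us look closely at $E; \mathsf{Rew}(\Gamma, R); E' \vdash f((e_1\sigma_c)(\delta'{\downarrow}){\downarrow}, \ldots, (e_n\sigma_c)(\delta'{\downarrow}){\downarrow})$, which is 
a normalized $\delta'{\downarrow}$-instance of $E; \Gamma; Ran(\sigma_c) \vdash f(e_1\sigma_c, \ldots, e_n\sigma_c)$. Since 
$E; \Gamma; Ran(\sigma_c) \vdash \delta \cup [\vec{s}/\vec{z}] \approx \sigma_c; \delta'$, we have $(e_m\sigma_c)(\delta'{\downarrow}){\downarrow} = (e_m\sigma_c\delta'){\downarrow} = (e_m\delta){\downarrow}$ for every $m$. 
Consequently, $E; \mathsf{Rew}(\Gamma, R); E' \vdash f(e_1\delta{\downarrow}, \ldots, e_n\delta{\downarrow})$ is a normalized canonical instance of 
$E; \Gamma; Ran(\sigma_c) \vdash f(e_1\sigma_c, \ldots, e_n\sigma_c)$. □ 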

5.1. Preservation of Reducibility. Although one would expect that an immediately 
covered goal is also covered, it is not always true, even for confluent systems. It turns 
out that we need a property of critical pairs that is stronger than just joinability. Let us 
suppose that or : bool -> bool -> bool is defined by four rules by cases over true and 
false and that if : bool -> bool -> bool -> bool is defined by two rules by cases on 
the first argument. 

Inductive I : bool -> Set := C : forall b:bool, I (or b b). 

Symbol f : forall b:bool, I b -> bool 
Rules 
  f (or b b) (C b) --> if b (f true (C true)) (f false (C false)) 

In the example presented above all expressions used in types and rules are in normal form, 
all critical pairs are joinable, the system is terminating, and splitting of f b i along i 
results in the only reducible goal f (or b b) (C b). In spite of that, f is not completely 
defined, as f true (C true) is a normalized canonical instance of f (or b b) (C b) and 
it is not reducible. In order to know that an immediately covered goal is always covered we 
need one more condition on rewrite rules, called preservation of reducibility. 
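
The failure described above can be replayed with a small Python sketch. It is only an illustration of the syntactic matching involved: the term encoding and the names `OR_RULES`, `F_LHS`, `normalize` and `head_reducible_by_f` are assumptions of the sketch, and the four rules for or are the obvious ones by cases.

```python
TRUE, FALSE = ("true",), ("false",)

# or defined by four rules, by cases over true and false.
OR_RULES = [
    (("or", TRUE, TRUE), TRUE),
    (("or", TRUE, FALSE), TRUE),
    (("or", FALSE, TRUE), TRUE),
    (("or", FALSE, FALSE), FALSE),
]

# Left-hand side of the only rule for f; the string "b" is a (non-linear) rule variable.
F_LHS = ("f", ("or", "b", "b"), ("C", "b"))

def match(pattern, term, subst):
    """Syntactic matching; a repeated pattern variable must be bound to equal terms."""
    if isinstance(pattern, str):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def instantiate(term, subst):
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(instantiate(a, subst) for a in term[1:])

def normalize(term, rules):
    """Innermost normalization with respect to rules (assumed terminating)."""
    if isinstance(term, str):
        return term
    term = (term[0],) + tuple(normalize(a, rules) for a in term[1:])
    for lhs, rhs in rules:
        subst = match(lhs, term, {})
        if subst is not None:
            return normalize(instantiate(rhs, subst), rules)
    return term

def head_reducible_by_f(term):
    return match(F_LHS, term, {}) is not None

# The instance b := true of the goal f (or b b) (C b) ...
instance = instantiate(F_LHS, {"b": TRUE})
# ... and the same instance with its arguments normalized: f true (C true).
normalized_instance = (instance[0],) + tuple(normalize(a, OR_RULES) for a in instance[1:])
```

Here `head_reducible_by_f(instance)` holds, but `head_reducible_by_f(normalized_instance)` does not: once `or true true` has been reduced to `true`, the left-hand side `f (or b b) (C b)` no longer matches syntactically.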

Definition 5.6. A definition by rewriting $\mathsf{Rew}(\Gamma, R)$ preserves reducibility in an environment 
$E$ if for every critical pair $(f(\vec{u}), r\delta)$ of a rule $G_1 \vdash f(\vec{e}) \longrightarrow r$ in $R$ with a rule $G_2 \vdash g \longrightarrow d$ 
coming from $R$ or from some other rewrite definition in $E$, the term $f(\vec{u}{\downarrow})$ is head-reducible 
by $R$. 

Note that by using ? variables in rewrite rules one can get rid of (some) critical pairs and
hence make a definition by rewriting satisfy this property. In the example above one could
write f ? (C b) as the left-hand side. This would also make the system non-terminating,
and show that f is not really well-defined.

Of course all orthogonal rewrite systems, in particular inductive elimination schemes,
as defined in [24], preserve reducibility.

Lemma 5.7. Let $E \vdash e : t$ and $e = f(e_1, \ldots, e_n)$, where $f$ of arity $n$ comes from $Rew(\Gamma, R)$
which preserves reducibility. If $e$ is head-reducible by $R$ then $f(e_1{\downarrow}, \ldots, e_n{\downarrow})$ is also
head-reducible by $R$.

Proof. By induction on $\longrightarrow$. If $e_1, \ldots, e_n$ are in normal form then the conclusion is obvious.
Otherwise, let $G_1 \vdash f(l) \longrightarrow r$ be a rule from $R$ and $\gamma$ a substitution such that $f(e) = f(l)\gamma$
and let us make one reduction step $e_i \longrightarrow e_i'$, using the rule $G_2 \vdash g \longrightarrow d$.

There are two possibilities: the reduction in $e_i$ happens either in the substitution $\gamma$, i.e.
in the term $\gamma(x)$, where $x$ is a free variable of $f(l)$, or it happens on a position $p$ that
belongs to $f(l)$. In the former case, let us do the identical reduction in all other instances of $x$.
Obviously, we get a term $f(e_1', \ldots, e_n')$ that is smaller than $e$ in $\longrightarrow$ and is still an instance
of $f(l)$. Hence by the induction hypothesis we get the desired conclusion.

Otherwise, $f(l)$ and $g$ superpose at some nonvariable position and we have $f(l)|_p\gamma = g\xi$
for some position $p$ and substitution $\xi$. Since we may suppose that the free variables of $f(l)$
and $g$ are different, we get $f(l)|_p(\gamma \cup \xi) = g(\gamma \cup \xi)$. Let $\delta$ be the most general unifier of
$f(l)|_p$ and $g$ and let $\langle f(u), r\delta \rangle$ be the corresponding critical pair. Since $\delta$ is the most general
unifier, there exists $\sigma$ such that $(\gamma \cup \xi) = \delta; \sigma$ and $f(e) = f(l)\gamma = f(l)(\gamma \cup \xi) = f(l)\delta\sigma$
with $f(l)\delta\sigma \longrightarrow f(u)\sigma = f(e_1, \ldots, e_i', \ldots, e_n)$. By preservation of reducibility $f(u{\downarrow})$ is
head-reducible by $R$. Hence $f(u{\downarrow})\sigma$ is also head-reducible by $R$. As above, we can apply
the induction hypothesis and deduce that $f(e{\downarrow})$ is head-reducible by $R$. □

Lemma 5.8. Let $Rew(\Gamma, R)$ preserve reducibility in an environment $E$, let $f \in \Gamma$ and let
$E; \Gamma; \Delta \vdash f(e)$ be a goal. If it is immediately covered then it is covered.

Proof. Let $E; \Gamma; \Delta \vdash f(e)$ be a goal immediately covered by $R$ and $\delta : \Delta \to E; Rew(\Gamma, R); E'$
be a canonical substitution. Obviously, $E; Rew(\Gamma, R); E' \vdash f(e\delta)$ is immediately covered
by $R$. Hence, by Lemma 5.7, $E; Rew(\Gamma, R); E' \vdash f(e\delta{\downarrow})$ is also immediately covered by $R$,
i.e. $E; \Gamma; \Delta \vdash f(e)$ is covered. □
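
$$\frac{t_1 <_{p\cdot 1} t_1' \Rightarrow S_1 \qquad t_2 <_{p\cdot 2} t_2' \Rightarrow S_2}{(x : t_1)t_2 <_p (x : t_1')t_2' \Rightarrow S_1 \cup S_2}
\qquad
\frac{t_1 <_{p\cdot 1} t_1' \Rightarrow S_1 \qquad t_2 <_{p\cdot 2} t_2' \Rightarrow S_2}{\lambda x : t_1.t_2 <_p \lambda x : t_1'.t_2' \Rightarrow S_1 \cup S_2}$$

$$\frac{t_1 <_{p\cdot 1} t_1' \Rightarrow S_1 \qquad t_2 <_{p\cdot 2} t_2' \Rightarrow S_2}{t_1\, t_2 <_p t_1'\, t_2' \Rightarrow S_1 \cup S_2}
\qquad
\frac{t_1 <_{p\cdot 1} t_1' \Rightarrow S_1 \quad \ldots \quad t_n <_{p\cdot n} t_n' \Rightarrow S_n}{h(t_1, \ldots, t_n) <_p h(t_1', \ldots, t_n') \Rightarrow S_1 \cup \cdots \cup S_n}\ \text{$h$ is a constant}$$

$$\frac{t_1, t_2 \notin Var \qquad head(t_1) \neq head(t_2)}{t_1 <_p t_2 \Rightarrow \{\bot\}}
\qquad
\frac{t_1 \in \Delta_1 \qquad t_2 \notin (Var \setminus \Delta_2)}{t_1 <_p t_2 \Rightarrow \{p\}}$$

$$\frac{(t_1 \in (Var \setminus \Delta_1) \wedge t_1 = t_2) \vee (t_1 \notin Var \wedge t_2 \in \Delta_2)}{t_1 <_p t_2 \Rightarrow \emptyset}
\qquad
\frac{(t_1 \in (Var \setminus \Delta_1) \wedge t_1 \neq t_2) \vee (t_2 \in (Var \setminus \Delta_2) \wedge t_1 \neq t_2)}{t_1 <_p t_2 \Rightarrow \{\bot\}}$$

Figure 2: Splitting matching rules, parametrized by $\Delta_1$, $\Delta_2$
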



5.2. Coverage Checking Algorithm. In this section we present an algorithm checking 
whether a set of goals is covered by the given set of rewrite rules. The algorithm is correct 
only for definitions that preserve reducibility. The algorithm, in a loop, picks a goal, checks 
whether it is immediately covered, and if not, splits the goal replacing it by the subgoals 
resulting from splitting. In order to ensure termination, splitting is limited to safe splitting 
variables. Intuitively, a splitting variable is safe if it lies within the contour of the left-hand 
side of some rule when we superpose the tree representation of the left-hand side with the 
tree representation of the goal. The number of nodes that have to be added to the goal in 
order to fill the tree of the left-hand side is called a distance, and a sum of distances over 
all rules is called a measure. Since the measures of goals resulting from splitting are smaller 
than the measure of the original goal, the coverage checking algorithm terminates. 

This subsection is organized as follows. We start by defining the splitting matching 
algorithm which is used to define safe splitting variables. Next, we provide definitions and 
lemmas needed to prove termination of the coverage checking algorithm and then we give 
the algorithm itself and the proof of its correctness. We conclude this subsection with some 
positive and negative examples leading to an extension of the algorithm allowing us to 
accept definitions by case analysis even if the unification algorithm is not strong enough. 

Let us start with the splitting matching algorithm which finds variables in $t_1$ that lie
within the contour of $t_2$.

Definition 5.9 (Splitting matching). The splitting matching algorithm is defined in
Figure 2. Given two sequents $\Delta_1 \vdash t_1$ and $\Delta_2 \vdash t_2$, it returns the unique set $S$ such that
$t_1 <_\Lambda t_2 \Rightarrow S$ is derivable. The set $S$ is a subset of $\{\bot\} \cup \{p \in Pos(t_1) \mid t_1|_p \in \Delta_1\}$.

Definition 5.10 (Safe splitting variable). Let $\Delta_1 \vdash t_1$ and $\Delta_2 \vdash t_2$ be sequents such that
$t_2$ is a left-hand side of a rule from $R$ and let $S$ be a set such that $t_1 <_\Lambda t_2 \Rightarrow S$ and $\bot \notin S$.
A variable $x \in \Delta_1$ is a safe splitting variable for $\Delta_1 \vdash t_1$ along $\Delta_2 \vdash t_2$ if it is a splitting
variable and there exists $p \in S$ such that $t_1|_p = x$ and either $t_2|_p$ is a variable declared in
$\Delta_2$ or $t_2|_p = c(e)$ for some constructor $c$ and some terms $e$.

The set of safe splitting variables for the sequent $\Delta_1 \vdash t_1$ along $\Delta_2 \vdash t_2$ is denoted by
$SV(\Delta_1 \vdash t_1, \Delta_2 \vdash t_2)$ or $SV(t_1, t_2)$ for short. $SV(t, R)$ is the set of safe splitting variables
for $t$ along left-hand sides of rules from $R$.

Example 5.11. In the goal rotr (S (S nC)) (Node Leaf b (S nC) C) there are two 
safe splitting variables b and C along the left-hand sides of the rules defining rotr. 

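Definition 5.12 (Distance). Let $\Delta_1 \vdash t_1$ and $\Delta_2 \vdash t_2$ be sequents and $S$ be a set such
that $t_1 <_\Lambda t_2 \Rightarrow S$. If $\bot \notin S$ then the distance of $\Delta_1 \vdash t_1$ from $\Delta_2 \vdash t_2$, denoted by
$dist(\Delta_1 \vdash t_1, \Delta_2 \vdash t_2)$ or $dist(t_1, t_2)$, equals $\sum_{p \in S} |t_2|_p|$; otherwise it is equal to 0.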
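
To make the two definitions concrete, here is a minimal sketch of splitting matching and distance restricted to first-order terms. It rests on several assumptions that are not in the paper: terms are nested tuples, there are no binders (so every variable is free and the rules about bound variables never fire), the size of a term counts its nodes with variables counting as one, and a result containing $\bot$ is represented by None. The function names are hypothetical.

```python
# Terms: a variable is a str; an application is a tuple (head, arg1, ..., argn).
def is_var(t):
    return isinstance(t, str)

def size(t):
    """Number of nodes of t (assumption: a variable counts as one node)."""
    return 1 if is_var(t) else 1 + sum(size(a) for a in t[1:])

def subterm(t, pos):
    """Subterm of t at position pos (a tuple of 1-based argument indices)."""
    for i in pos:
        t = t[i]
    return t

def split_match(goal, lhs, pos=()):
    """First-order fragment of splitting matching.
    Returns the set S of positions, or None when S would contain bottom."""
    if is_var(goal):      # goal variable lying inside the contour of lhs
        return {pos}
    if is_var(lhs):       # lhs variable facing a non-variable goal subterm
        return set()
    if goal[0] != lhs[0] or len(goal) != len(lhs):
        return None       # different head symbols
    result = set()
    for i in range(1, len(goal)):
        s = split_match(goal[i], lhs[i], pos + (i,))
        if s is None:
            return None
        result |= s
    return result

def dist(goal, lhs):
    """Distance of goal from lhs: sum of |lhs|_p| over p in S, or 0 on bottom."""
    s = split_match(goal, lhs)
    return 0 if s is None else sum(size(subterm(lhs, p)) for p in s)

TRUE, FALSE = ("true",), ("false",)
lhs = ("or'", TRUE, "y")                  # left-hand side  or' true y

print(dist(("or'", "x0", "x1"), lhs))     # 2: both goal variables are in the contour
print(dist(("or'", TRUE, "x1"), lhs))     # 1: after x0 := true
print(dist(("or'", FALSE, "x1"), lhs))    # 0: after x0 := false the heads clash
```

Both substitutions x0 := true and x0 := false are what splitting a boolean variable would produce, and in both cases the distance drops strictly.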

The following two lemmas state that the distance of a term does not increase when we apply a
substitution, and that it decreases strictly if the substitution results from splitting.

Lemma 5.13 (Distance of a substituted sequent). Let $\Delta_1 \vdash t_1$ and $\Delta_2 \vdash t_2$ be sequents and
let $S$ be a set such that $t_1 <_\Lambda t_2 \Rightarrow S$. Then for every substitution $\gamma : \Delta_1 \to \Delta'$ we have
$dist(\Delta' \vdash t_1\gamma, \Delta_2 \vdash t_2) \leq dist(\Delta_1 \vdash t_1, \Delta_2 \vdash t_2)$.

Moreover, if $\bot \in S$ then $dist(\Delta' \vdash t_1\gamma, \Delta_2 \vdash t_2) = dist(\Delta_1 \vdash t_1, \Delta_2 \vdash t_2) = 0$.

Proof. Let $S_\gamma$ be a set such that $t_1\gamma <_\Lambda t_2 \Rightarrow S_\gamma$ and let us denote $dist(\Delta_1 \vdash t_1, \Delta_2 \vdash t_2)$
by $d$ and $dist(\Delta' \vdash t_1\gamma, \Delta_2 \vdash t_2)$ by $d_\gamma$.

If $\bot \in S$ then $d = 0$. Note that $\bot \in S$ if and only if there is a position $p$ such that
the subterms occurring at $p$ in $t_1$ and $t_2$ either have different head symbols, or $t_2|_p$ (resp. $t_1|_p$)
is a bound variable in $t_2$ (resp. $t_1$) and $t_1|_p \neq t_2|_p$. Of course, if we compare $t_1\gamma|_p$ and $t_2|_p$
then either they still have different head symbols or $t_2|_p$ (resp. $t_1|_p$) is a bound variable and
$t_1\gamma|_p \neq t_2|_p$. Hence $d_\gamma = 0$.

If $\bot \notin S$ then $d = \sum_{p \in S} |t_2|_p|$. If $\bot \in S_\gamma$ then obviously $0 = d_\gamma \leq d$. Otherwise, let
us take $p \in S$ and the set $Q_p = \{q \in S_\gamma \mid p \preceq q\}$, where $\preceq$ is the prefix ordering. Since all
positions from $Q_p$ are independent (as $t_1\gamma|_q \in Var$ for every $q \in S_\gamma$) we have $\sum_{q \in Q_p} |t_2|_q| \leq
|t_2|_p|$ and the equality holds only if $Q_p = \{p\}$. Let us show that $\forall q \in S_\gamma\ \exists p \in S\ \ p \preceq q$.
Indeed, assuming that $\bot \notin S_\gamma$, $q \in S_\gamma$ either because $q \in S$ and $(t_1|_q)\gamma \in \Delta'$ or because
there is a position $p \in S$ such that $q = p \cdot q'$ for some $q'$ and $(t_1|_p\gamma)|_{q'} \in \Delta'$. Of course, since
positions in $S$ are independent, the sets $Q_p$ are disjoint for different $p$.

Hence $S_\gamma = \bigcup_{p \in S} Q_p$ and $d_\gamma = \sum_{q \in S_\gamma} |t_2|_q| = \sum_{p \in S} \sum_{q \in Q_p} |t_2|_q| \leq \sum_{p \in S} |t_2|_p| = d$. □

Lemma 5.14 (Distance after splitting strictly decreases). Let $E; \Gamma; \Delta \vdash f(e_1, \ldots, e_n)$ be a
goal, $t = f(e_1, \ldots, e_n)$, let $G \vdash l \longrightarrow r$ be one of the rewrite rules for $f$ in $R$ and let $S$ be
a set such that $t <_\Lambda l \Rightarrow S$ and $\bot \notin S$. If $x : I\,u \in SV(t, l)$ is a safe splitting variable and
splitting $t$ along $x$ is successful then $dist(Ran(\sigma_c) \vdash t\sigma_c, G \vdash l) < dist(\Delta \vdash t, G \vdash l)$ for
every $\sigma_c \in Sp(x)$.

Proof. Let $\sigma_c \in Sp(x)$ and let $S_c$ be a set such that $t\sigma_c <_\Lambda l \Rightarrow S_c$. By Lemma 5.13 we
have $dist(t\sigma_c, l) \leq dist(t, l)$. Let us analyze the proof of that lemma and show that in case
of a substitution resulting from splitting there is a strict inequality between $dist(t\sigma_c, l)$ and
$dist(t, l)$. In the proof it was noticed that for every $p \in S$, $\sum_{q \in Q_p} |l|_q| \leq |l|_p|$, where $Q_p =
\{q \in S_c \mid p \preceq q\}$, and that $\sum_{q \in Q_p} |l|_q| = |l|_p|$ only if $Q_p = \{p\}$. Consequently, if we show
that there exists a position $p$ such that $p \notin Q_p$, we immediately get $dist(t\sigma_c, l) < dist(t, l)$.

Since $x : I\,u$ is a safe splitting variable for $t$ along $l$, there exists a position $p \in S$
such that $t|_p = x$ and $l|_p \in G$ or $l|_p = c'(a)$ for some constructor $c'$. Since $\sigma_c$ results from
successful splitting, $x\sigma_c = c(b)$ for some $b$. Now, there are three cases. If $l|_p \in G$ then
the matching step at position $p$ yields $\emptyset$, so $Q_p = \emptyset$ (and hence $p \notin Q_p$). Otherwise, if $c' \neq c$
then we fall into an easy case when $\bot \in S_c$ and $dist(t\sigma_c, l) = 0 < |c'(a)| \leq dist(t, l)$. Finally, if $c' = c$, the
computation of $S_c$ passes through the step $c(b) <_p c(a) \Rightarrow \_$. This means that all positions
in $Q_p$ come from $b$ and that they are longer than $p$, or that $Q_p = \emptyset$. Thus $p \notin Q_p$ and
$dist(t\sigma_c, l) < dist(t, l)$. □

Definition 5.15 (Measure of a goal). Let $E; \Gamma; \Delta \vdash f(e_1, \ldots, e_n)$ be a goal and let $R_f =
\{G_i \vdash l_i \longrightarrow r_i\}_{i=1 \ldots m}$ be the set of rules for $f$. The measure of $E; \Gamma; \Delta \vdash f(e_1, \ldots, e_n)$
equals $\sum_{i=1 \ldots m} dist(\Delta \vdash f(e_1, \ldots, e_n), G_i \vdash l_i)$.

It follows directly from Lemmas 5.13 and 5.14 that the measure of a goal strictly
decreases after applying a substitution resulting from splitting.

Lemma 5.16 (Measure after splitting strictly decreases). Let $E; \Gamma; \Delta \vdash f(e_1, \ldots, e_n)$ be a
goal, $t = f(e_1, \ldots, e_n)$, $R_f = \{G_i \vdash l_i \longrightarrow r_i\}_{i=1 \ldots m}$ be the set of rules for $f$ and $S$ be a set
such that $t <_\Lambda l_j \Rightarrow S$ for some $j \in \{1, \ldots, m\}$ and $\bot \notin S$. If $x : I\,u \in SV(t, R)$ is a safe
splitting variable and $\{\varphi_1, \ldots, \varphi_n\}$ is the result of successful splitting of $t$ along $x$ then the
measure of every $\varphi_i$ is strictly smaller than the measure of $t$.

Proof. For every $\sigma_c \in Sp(x)$, we have to show that $\sum_{i=1 \ldots m} dist(t\sigma_c, l_i) < \sum_{i=1 \ldots m} dist(t, l_i)$.
This follows from $dist(t\sigma_c, l_j) < dist(t, l_j)$, which is the consequence of Lemma 5.14, and
$dist(t\sigma_c, l_i) \leq dist(t, l_i)$ for all $i = 1 \ldots m$, $i \neq j$, which follows from Lemma 5.13. □
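
As a worked illustration of Lemma 5.16 (an example constructed here, not taken from the paper), consider three first-order left-hand sides $l_1 = \mathtt{lt}\ \mathtt{O}\ y$, $l_2 = \mathtt{lt}\ x\ \mathtt{O}$ and $l_3 = \mathtt{lt}\ (\mathtt{S}\ x)\ (\mathtt{S}\ y)$, the goal $t = \mathtt{lt}\ x_0\ x_1$, and assume that $|u|$ counts the nodes of $u$, a variable counting as one node. Splitting along $x_0$ gives the two substitutions $x_0 \mapsto \mathtt{O}$ and $x_0 \mapsto \mathtt{S}\ v$:

```latex
\begin{align*}
\text{measure}(\mathtt{lt}\ x_0\ x_1)
  &= \underbrace{(|\mathtt{O}| + |y|)}_{dist(t,\,l_1)}
   + \underbrace{(|x| + |\mathtt{O}|)}_{dist(t,\,l_2)}
   + \underbrace{(|\mathtt{S}\ x| + |\mathtt{S}\ y|)}_{dist(t,\,l_3)}
   = 2 + 2 + 4 = 8, \\
\text{measure}(\mathtt{lt}\ \mathtt{O}\ x_1)
  &= \underbrace{|y|}_{l_1}
   + \underbrace{|\mathtt{O}|}_{l_2}
   + \underbrace{0}_{l_3:\ \bot}
   = 1 + 1 + 0 = 2, \\
\text{measure}(\mathtt{lt}\ (\mathtt{S}\ v)\ x_1)
  &= \underbrace{0}_{l_1:\ \bot}
   + \underbrace{|\mathtt{O}|}_{l_2}
   + \underbrace{(|x| + |\mathtt{S}\ y|)}_{l_3}
   = 0 + 1 + 3 = 4.
\end{align*}
```

Each individual distance is non-increasing, as Lemma 5.13 requires, and at least one of them decreases strictly in each subgoal, so both measures (2 and 4) are below the original 8.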



Definition 5.17 (Coverage checking algorithm). Let $W$ be a set of pairs consisting of a
goal and a set of safe variables of that goal along left-hand sides of rules from $R$ and let
$CE$ be a set of goals. The coverage checking algorithm works as follows:

Initialize
    $W = \{(E; \Gamma; x_1 : t_1; \ldots; x_n : t_n \vdash f(x_1, \ldots, x_n),\ SV(f(x_1, \ldots, x_n), R))\}$
    $CE = \emptyset$

Repeat
    (1) choose a pair $(\varphi, X)$ from $W$,
    (2) if $\varphi$ is immediately covered by one of the rules from $R$ then
        $W := W \setminus \{(\varphi, X)\}$
    (3) otherwise
        (a) if $X = \emptyset$ then $W := W \setminus \{(\varphi, X)\}$, $CE := CE \cup \{\varphi\}$
        (b) otherwise choose $x \in X$; split $\varphi$ along $x$
            (i) if splitting is successful and returns $\{\varphi_1, \ldots, \varphi_n\}$ then
                $W := W \setminus \{(\varphi, X)\} \cup \{(\varphi_i, SV(\varphi_i, R))\}_{i=1 \ldots n}$,
            (ii) otherwise $W := W \setminus \{(\varphi, X)\} \cup \{(\varphi, X \setminus \{x\})\}$
until $W = \emptyset$
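
The loop above can be sketched for a much simpler setting than the one of the paper. The code below is a minimal illustration under explicit assumptions: terms are first-order nested tuples, types are simple (a hypothetical CONSTRUCTORS table for bool and nat), left-hand sides are constructor patterns, and splitting therefore never fails, so step 3(b)(ii) does not occur. Preservation of reducibility is not checked. All names are hypothetical.

```python
import itertools

# Hypothetical signature: type name -> list of (constructor, argument types).
CONSTRUCTORS = {"bool": [("true", []), ("false", [])],
                "nat": [("O", []), ("S", ["nat"])]}

def is_var(t):                      # variables are str, applications are tuples
    return isinstance(t, str)

def match(pattern, term, subst):
    """Syntactic matching (non-linear patterns allowed); None on failure."""
    if is_var(pattern):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if is_var(term) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def split_match(goal, lhs, pos=()):
    """Positions of goal variables inside the contour of lhs; None encodes bottom."""
    if is_var(goal):
        return {pos}
    if is_var(lhs):
        return set()
    if goal[0] != lhs[0] or len(goal) != len(lhs):
        return None
    result = set()
    for i in range(1, len(goal)):
        s = split_match(goal[i], lhs[i], pos + (i,))
        if s is None:
            return None
        result |= s
    return result

def subterm(t, pos):
    for i in pos:
        t = t[i]
    return t

def substitute(t, x, s):
    if is_var(t):
        return s if t == x else t
    return (t[0],) + tuple(substitute(a, x, s) for a in t[1:])

def safe_vars(goal, lhss, cnames):
    """Goal variables facing a variable or a constructor in some lhs."""
    safe = set()
    for lhs in lhss:
        for p in split_match(goal, lhs) or ():
            l = subterm(lhs, p)
            if is_var(l) or l[0] in cnames:
                safe.add(subterm(goal, p))
    return safe

def check_coverage(f, arg_types, lhss):
    """Returns the list CE of potential counterexamples (empty means covered)."""
    fresh = itertools.count()
    cnames = {c for cs in CONSTRUCTORS.values() for c, _ in cs}
    types = {f"x{i}": ty for i, ty in enumerate(arg_types)}
    work, counterexamples = [((f,) + tuple(types), types)], []
    while work:
        goal, types = work.pop()                                   # step (1)
        if any(match(lhs, goal, {}) is not None for lhs in lhss):  # step (2)
            continue
        safe = safe_vars(goal, lhss, cnames)
        if not safe:                                               # step (3a)
            counterexamples.append(goal)
            continue
        x = sorted(safe)[0]                                        # step (3b)
        for c, arg_tys in CONSTRUCTORS[types[x]]:
            new = [f"v{next(fresh)}" for _ in arg_tys]
            work.append((substitute(goal, x, (c,) + tuple(new)),
                         {**types, **dict(zip(new, arg_tys))}))
    return counterexamples

TRUE, FALSE = ("true",), ("false",)
OR_LHSS = [("or'", "x", "x"), ("or'", TRUE, "y"), ("or'", "x", TRUE)]
print(check_coverage("or'", ["bool", "bool"], OR_LHSS))      # []
print(check_coverage("or'", ["bool", "bool"], OR_LHSS[:2]))  # [("or'", ('false',), ('true',))]
```

With all three left-hand sides the run ends with no counterexample; dropping the third one leaves the goal or' false true in CE. Restricting splitting to safe variables is what keeps the loop finite in this sketch as well.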

Lemma 5.18. The coverage checking algorithm terminates.

Proof. Let us consider the following measure $M(W)$: the multiset of lexicographically
ordered pairs consisting of the measure of $\varphi$ and the size of $X$, for all $(\varphi, X) \in W$. We will
show that every loop of the algorithm strictly decreases $M(W)$. Consider $(\varphi, X) \in W$. If $\varphi$
is immediately covered then obviously the measure of $W \setminus \{(\varphi, X)\}$ is strictly smaller than
the measure of $W$. Otherwise, we split $\varphi$ along some $x \in X$. If splitting fails then $(\varphi, X)$ is
replaced by $(\varphi, X \setminus \{x\})$ and the size of the second component strictly decreases. If splitting
is successful and returns $\{\varphi_1, \ldots, \varphi_n\}$ then $(\varphi, X)$ is replaced by $\{(\varphi_i, SV(\varphi_i, R))\}_{i=1 \ldots n}$. By
Lemma 5.16 the measures of goals from $\{\varphi_1, \ldots, \varphi_n\}$ are strictly smaller than the measure
of $\varphi$ and consequently $M(W)$ strictly decreases. □
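
The decrease of $M(W)$ can be followed on a small run. The trace below is an illustration constructed here under simplifying assumptions (first-order terms, $|u|$ counting nodes with a variable counting as one), for the three left-hand sides $\mathtt{or'}\ x\ x$, $\mathtt{or'}\ \mathtt{true}\ y$ and $\mathtt{or'}\ x\ \mathtt{true}$ and the initial goal $\mathtt{or'}\ x_0\ x_1$. Each pair is (measure of the goal, number of safe variables).

```latex
\begin{align*}
M(W_0) &= \{(6, 2)\}
  && W_0 = \{\mathtt{or'}\ x_0\ x_1\} \\
M(W_1) &= \{(3, 1),\ (2, 1)\}
  && \text{split along } x_0:\ \mathtt{or'}\ \mathtt{true}\ x_1,\ \ \mathtt{or'}\ \mathtt{false}\ x_1 \\
M(W_2) &= \{(2, 1)\}
  && \mathtt{or'}\ \mathtt{true}\ x_1 \text{ is immediately covered} \\
M(W_3) &= \{(0, 0),\ (0, 0)\}
  && \text{split along } x_1:\ \mathtt{or'}\ \mathtt{false}\ \mathtt{true},\ \ \mathtt{or'}\ \mathtt{false}\ \mathtt{false} \\
M(W_4) &= \{(0, 0)\}
  && \mathtt{or'}\ \mathtt{false}\ \mathtt{false} \text{ is immediately covered} \\
M(W_5) &= \emptyset
  && \mathtt{or'}\ \mathtt{false}\ \mathtt{true} \text{ is immediately covered}
\end{align*}
```

Every step either removes an element or replaces one element by finitely many strictly smaller ones, which is a strict decrease in the multiset extension of the lexicographic order.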

Lemma 5.19. If $Rew(\Gamma, R)$ preserves reducibility and the algorithm stops with $CE = \emptyset$
then the initial goal is covered.

Proof. Let us consider a successful run of the algorithm, performing a finite number of times
the body of the Repeat loop and resulting in $CE = \emptyset$. By induction on $n$, the number of
Repeat steps until the end of the algorithm, we prove that the goals appearing in $W$ are
covered.

The base case, for $n = 0$, is trivial since $W_0$ is empty.

Now suppose that $n$ steps before the end of the algorithm all goals in $W_n$ are covered
and let us check that this was true $n + 1$ steps before the end, i.e. one step of the algorithm
earlier.

In case 2, $W_{n+1}$ contains all goals from $W_n$ and one goal $\varphi$ which is immediately covered
by a rule in $R$. By preservation of reducibility (Lemma 5.8) every normalized canonical
instance of $\varphi$ is also immediately covered and consequently all goals of $W_{n+1}$ are covered.
Case 3(a) is impossible since it makes the set $CE$ non-empty.

In case 3(b)i, $W_{n+1}$ contains some of the goals from $W_n$ and one goal $\varphi$ whose subgoals
resulting from successful splitting are already in $W_n$. By Lemma 5.5 the set of normalized
canonical instances of these subgoals contains the set of normalized canonical instances of $\varphi$.
Hence $W_{n+1}$ is covered.

In case 3(b)ii the sets of goals in $W_{n+1}$ and $W_n$ are equal.

Hence the initial goal in $W$ is also covered. □

Example 5.20. The beginning of a possible run of the algorithm for the function rotr is
presented already in Example 5.4. Both splitting operations are performed on safe
variables, as required. We are left with the goal rotr (S (nA + nC)) (Node nA A b nC C).
Splitting along A results in:

  rotr (S (O + nC)) (Node O Leaf b nC C)

  rotr (S ((S (nX+nZ)) + nC)) (Node (S (nX+nZ)) (Node nX X y nZ Z) b nC C)

immediately covered by the second and the third rule respectively.

Since we started with the initial goal rotr n t and since the definition of rotr preserves
reducibility, it is complete.

When the coverage checking algorithm stops with $CE \neq \emptyset$, we cannot deduce that R is
complete. The set CE contains potential counterexamples. They can be true counterexam- 
ples, false counterexamples, or goals for which splitting failed along all safe variables, due to 
incompleteness of the unification algorithm. In some cases further splitting of a false coun- 
terexample may result in reducible goals or in the elimination of the goal as uninhabited, 
but it may also loop. Some solutions preventing looping (finitary splitting) can be found 
in pg. 

Unfortunately splitting failure due to incompleteness of the unification may happen 
while checking coverage of a definition by case analysis over complex dependent inductive 
types (for example trees of size 2), even if rules for all constructors are given. Therefore, it 
is advisable to add a second phase to our algorithm, which would treat undefined output 
of unification as success. Using this second phase of the algorithm, one can accept all 
definitions by case analysis that can be written in Coq. 

Example 5.21. Let g be a function defined by case analysis and let g' be its version
defined by rewriting (without the impossible Leaf case):

Definition g (t : tree (S (S O))) : bool := match t with
  | Leaf => false
  | Node _ Leaf _ _ _ => true
  | Node _ (Node _ _ _ _ _) _ _ _ => false
end.

Symbol g' : tree (S (S O)) -> bool
Rules
  g' (Node ?1 Leaf b ?2 t') --> true
  g' (Node ?1 (Node ?2 t b ?3 t') b' ?4 t'') --> false

Our algorithm starts with the goal g' x, splits it along x, easily detects that the Leaf
case is not possible but is stuck on Node n t b m t', because this requires deciding that
S (n+m) is unifiable with S (S O), which may be too hard for a unification algorithm (see
footnote 2). In that case the initial goal g' x becomes a potential counterexample.
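
The unification problem S (n+m) = S (S O) from the example has only finitely many closed solutions, which a brute-force search can list. The sketch below is not a unification algorithm and is not part of the paper: it assumes that + is plain addition on Peano numerals, defined by recursion on the first argument, and it simply enumerates small closed values of n and m. The helper names are hypothetical.

```python
from itertools import product

def peano(k):
    """Peano numeral for k as a nested tuple, e.g. 1 -> ("S", ("O",))."""
    return ("O",) if k == 0 else ("S", peano(k - 1))

def plus(n, m):
    """Assumed definition of +:  O + m = m,  (S n) + m = S (n + m)."""
    return m if n == ("O",) else ("S", plus(n[1], m))

target = peano(2)  # S (S O)

# Enumerate closed candidates for n and m; values above 1 cannot be solutions.
solutions = [(n, m) for n, m in product(range(3), repeat=2)
             if ("S", plus(peano(n), peano(m))) == target]

print(solutions)  # [(0, 1), (1, 0)]
```

The two solutions correspond to n = O, m = S O and n = S O, m = O. A unification procedure able to find them would let regular splitting succeed on the Node case.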

Accepting all definitions by case analysis. The second phase of our algorithm would
start only for the goals with safe splitting variables, i.e. where regular splitting failed because
the unification was too weak. In this phase, the splitting would become lax by treating the ?
unification result as successful and returning simple substitutions $\sigma_c = \{x \mapsto c(z)\}$ for
such cases (see Definition 5.3). As a result the goals would not be well-typed sequents
anymore, which has to be taken into account by the unification algorithm. On the other
hand typability is not required for splitting matching and the rest of the algorithm, which
would work just as described in Definition 5.17. Both arguments of termination and
correctness of the algorithm would hold.

Going back to our example: redoing the lax splitting on x in the goal g' x, one gets
again that Leaf is impossible, but Node is now accepted and leads to an (untyped) goal
g' (Node n t b m t'). Splitting on t is now successful for both constructors and both
resulting goals are reducible.



6. More examples 



6.1. Heterogeneous equality. Consider the inductive predicate JMeq of heterogeneous 
equality with its non-standard elimination rule: 

Inductive JMeq (A:Set) (a:A) : forall B:Set, B -> Set := JMrefl : JMeq A a A a.

Symbol JMelim : forall (A:Set) (a:A) (P : forall b:A, JMeq A a A b -> Set),
    P a (JMrefl A a) -> forall (b:A) (e : JMeq A a A b), P b e
Rules
  JMelim A a P h a (JMrefl A a) --> h

One splitting of JMelim A a P h b e over e results in JMelim A a P h a (JMrefl A a)
which is equal to the left-hand side of the rule. Hence this rule completely defines JMelim.

Footnote 2. Note that a better unification algorithm could find the two most general solutions
n=O, m=(S O) and n=(S O), m=O. Then splitting would result in two goals immediately covered
by rules for g'.

6.2. Uniqueness of Identity Proofs and Streicher's axiom K. Consider the type eq 
and the definition of function UIP, proving that identity proofs are unique: 

Inductive eq (A:Set) (a:A) : A -> Set := refl : eq A a a.

Symbol UIP : forall (A:Set) (a b:A) (p q : eq A a b), eq (eq A a b) p q
Rules
  UIP A a a (refl A a) (refl A a) --> refl (eq A a a) (refl A a)

The function UIP is completely defined since two subsequent splittings of UIP A a b p q,
along p and along q, result in UIP A a a (refl A a) (refl A a) which is exactly the
left-hand side of the only rule for UIP.

The rule for Streicher's axiom K can also easily be proved complete:

Symbol K : forall (A:Set) (a:A) (P : eq A a a -> Set),
    P (refl A a) -> forall p : eq A a a, P p
Rules
  K A a P h (refl A a) --> h

Note that both rules for UIP and K can also be written in a left-linear form:

  UIP A a ?1 (refl ?2 ?3) (refl ?4 ?5) --> refl (eq A a a) (refl A a)
  K A a P h (refl ?1 ?2) --> h



6.3. Non pattern matching rules. These are two examples of complete definitions which
do not follow the pattern matching schemes as defined in [12] and [16].

Symbol or' : bool -> bool -> bool
Rules
  or' x x --> x
  or' true y --> true
  or' x true --> true

Symbol lt, diff : nat -> nat -> bool
Rules
  lt O y --> diff O y
  lt x O --> false
  lt (S x) (S y) --> lt x y
  diff x x --> false
  diff O (S y) --> true
  diff (S x) O --> true
  diff (S x) (S y) --> diff x y

7. Conclusions and Related Work

In this paper we study consistency of the calculus of constructions with rewriting. More 
precisely, we propose a formal system extending an arbitrary PTS with inductive definitions 
and definitions by rewriting. Assuming that suitable positivity and acceptance conditions 
guarantee termination and confluence, we formalize the notion of a complete definition by 
rewriting. We show that in every environment consisting only of inductive definitions and 
complete definitions by rewriting there is no proof of (x : *)x. Moreover, we present a 
sound and terminating algorithm for checking completeness of definitions. It is necessarily 
incomplete, since in presence of dependent types emptiness of types trivially reduces to 
completeness and the former is undecidable. 

Our coverage checking algorithm resembles the one proposed by Coquand in [12] for
Martin-Löf type theory and used by McBride for his OLEG calculus [16]. In these works
the procedure consisting in successive case-splittings is used to interactively build pattern
matching equations, or to check that a given set of equations can be built this way. Unlike in
our paper, Coquand and McBride do not have to worry whether all instances of a reducible 
subgoal are reducible. Indeed, in [12] pattern matching equations are meant to be applied 
to terms modulo conversion, and in [16] equations (or rather the order of splittings in the 
successful run of the coverage checking procedure) serve as a guideline to construct an 
OLEG term verifying the equations. Equations themselves are never used for reduction 
and the constructed term reduces according to existing rules. 

In our paper rewrite rules are matched against terms modulo $\alpha$-conversion. Rewriting
has to be confluent, strongly normalizing and has to preserve reducibility. Under these
assumptions we can prove completeness for all examples from [12] and for the class of
pattern matching equations considered in [16]. In particular we can deal with elimination
rules for inductive types and with Streicher's axiom K. Moreover, we can accept definitions
which depart from standard pattern matching, like rotr and +.

The formal presentation of our algorithm is directly inspired by the work of Pfenning
and Schürmann [18]. A motivation for that paper was to verify that a logic program in
the Twelf prover covers all possible cases. In LF, the base calculus of Twelf, there is
no polymorphism, no rewriting and conversion is modulo $\beta\eta$-conversion. The authors use
higher-order matching modulo $\beta\eta$-conversion, which is decidable for patterns à la Miller and
strict patterns. Moreover, since all types and function symbols are known in advance, the
coverage is checked with respect to all available function symbols. In our paper, conversion
contains rewriting and it cannot be used for matching; instead we use matching modulo
$\alpha$. This simplifies the algorithm searching for safe splitting variables, but on the other
hand it does not fit well with instantiation and normalization. To overcome this problem 
we introduce the notions of normalized canonical instance and preservation of reducibility 
which were not present in previously mentioned papers. Finally, since the sets of function 
symbols and rewrite rules grow as the environment extends, coverage is checked with respect 
to constructors only. 

Even though the worst-case complexity of the coverage checking is clearly exponential, 
for practical examples the algorithm should be quite efficient. It is very similar in spirit 
to the algorithms checking exhaustiveness of definitions by pattern matching in functional 
programming languages and these are known to work effectively in practice. 

An important issue which is not addressed in this paper is to know how much we
extend conversion. Of course it depends on the choice of conditions ACC and POS and on
the unification algorithm used for coverage checking. In particular, some of the definitions
by pattern matching can be encoded by recursors [13], so if ACC is strict, we may have no
extension at all. In general there seem to be at least two kinds of extensions. The first
are non-standard elimination rules for inductive types, but the work of McBride shows that 
the axiom K is sufficient to encode all other definitions by pattern matching considered by 
Coquand. The second are additional rules which extend a definition by pattern matching 
(like associativity for +). It is known that for first-order rewriting, these rules are inductive 
consequences of the pattern matching ones, i.e. all their canonical instances are satisfied as 
equations (see e.g. Theorem 7.6.5 in [19]). Unfortunately, this is no longer true for
higher-order rules over inductive types with functional arguments. Nevertheless it seems that such
rules are inductive consequences of the pattern matching rules if the corresponding equality 
is extensional. 

Finally, our completeness condition COMP verifies closure properties defined in [9, 10].
Hence, it is adequate for a smooth integration of rewriting with the module system present 
in Coq since its version 7.4. 